feat: Add optional idempotency support to batches API #3171
Conversation
Implements optional idempotency for batch creation using the `idem_tok` parameter:

* **Core idempotency**: the same token + parameters returns the existing batch
* **Conflict detection**: the same token + different parameters raises an HTTP 409 ConflictError
* **Metadata order independence**: different key ordering doesn't affect idempotency

**API changes:**

- Add optional `idem_tok` parameter to the `create_batch()` method
- Enhanced API documentation with idempotency extensions

**Implementation:**

- Reference provider supports idempotent batch creation
- ConflictError for proper HTTP 409 status code mapping
- Comprehensive parameter validation

**Testing:**

- Unit tests: focused tests covering core scenarios with parametrized conflict detection
- Integration tests: tests validating real OpenAI client behavior

This enables client-side retry safety and prevents duplicate batch creation when using the same idempotency token, following REST API conventions.

closes #3144
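A minimal sketch of the intended client-side usage, assuming an OpenAI-compatible client; the base URL, file ID, and passing the non-standard `idem_tok` through `extra_body` are assumptions, not confirmed details of this PR:

```python
# Hypothetical usage sketch: base URL, file ID, and the `extra_body`
# passthrough for the non-standard `idem_tok` parameter are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

first = client.batches.create(
    input_file_id="file-abc123",
    endpoint="/v1/chat/completions",
    completion_window="24h",
    extra_body={"idem_tok": "my-unique-token"},
)

# Retrying with the same token and identical parameters returns the
# existing batch instead of creating a duplicate; reusing the token
# with different parameters yields an HTTP 409.
retry = client.batches.create(
    input_file_id="file-abc123",
    endpoint="/v1/chat/completions",
    completion_window="24h",
    extra_body={"idem_tok": "my-unique-token"},
)
assert retry.id == first.id
```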
llama_stack/apis/batches/batches.py
```python
endpoint: str,
completion_window: Literal["24h"],
metadata: dict[str, str] | None = None,
idem_tok: str | None = None,  # intentionally bad name
```
what does this comment mean?
`idem_tok` is an intentionally bad name.
fwiw, i would have just called it `idempotency_key`, was the choice here because that's too verbose?
exactly this. choice was to get feedback.
it's a non-standard param. chances are other APIs will follow the lead here.
`idem_tok` -> `idempotency_key`?
i am partial to `idempotency_key` because I actually had to confirm that `idempotency_token` was equivalent to it (which it is, according to lots of google hits), so I imagine it's just convention exposure and `key` was exposed to me first. token can be an unfortunately loaded term in our ai land, fwiw.
```python
# allowing us to detect parameter conflicts
if idem_tok is not None:
    hash_input = idem_tok.encode("utf-8")
    hash_digest = hashlib.sha256(hash_input).hexdigest()[:24]
```
The default batch id uses a 16-char hex section; is there a reason to use a different length here?
+1
secret way to tell the difference. happy to align them.
oh that's fine then. i personally like to add a prefix for those reasons (OpenAI follows this but they don't expose an idempotency key) but different size is okay.
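For reference, a minimal sketch of what aligning the lengths while keeping a distinguishing prefix could look like; the `batch_idem_` prefix and helper name are hypothetical, not the provider's actual code:

```python
import hashlib
import secrets


def make_batch_id(idem_tok: str | None = None) -> str:
    if idem_tok is None:
        # default path: random 16-char hex section
        return f"batch_{secrets.token_hex(8)}"
    # idempotent path: the same token always yields the same ID; the
    # prefix makes the two kinds of IDs distinguishable at a glance
    digest = hashlib.sha256(idem_tok.encode("utf-8")).hexdigest()[:16]
    return f"batch_idem_{digest}"
```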
```python
# For idempotent requests, use the idempotency token for the batch ID
# This ensures the same token always maps to the same batch ID,
# allowing us to detect parameter conflicts
if idem_tok is not None:
```
I assumed we would generate the idempotency key on the server based on the concatenation of input metadata (similar to how we generate `chunk_id`) and then store that in the DB, rather than expose it in the API.

could you elaborate on the rationale for having it defined by the user of the client?
openai's `/v1/batches` allows for creating duplicate batches. if we don't expose something and de-dup server side, we'll break semantics.

another approach i considered was a user-passed `allow_duplicates`, which defaulted to true.

but the common approach for this functionality is a user-controlled key.
idempotency is always a client side concept. the client is telling the server the unique context under which the request was generated so that the server can dedupe things correctly.
Yes you both are right of course. 🤦🏻
I forgot that this should be generated by the client.
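To summarize the mechanism the thread converges on, here is a minimal sketch of client-supplied-token idempotency with server-side conflict detection; all names and the in-memory storage are hypothetical stand-ins for the reference provider's actual code:

```python
import hashlib
import json
import secrets


class ConflictError(Exception):
    """Mapped to HTTP 409 by the API layer."""


_batches: dict[str, dict] = {}  # batch_id -> {"params_hash": ..., "batch": ...}


def _params_hash(params: dict) -> str:
    # sort_keys canonicalizes key ordering (including metadata), so
    # {"a": "1", "b": "2"} and {"b": "2", "a": "1"} hash identically
    return hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()


def create_batch(params: dict, idem_tok: str | None = None) -> dict:
    if idem_tok is None:
        # no token: duplicates allowed, matching OpenAI /v1/batches semantics
        batch_id = f"batch_{secrets.token_hex(8)}"
    else:
        # the same token always maps to the same batch ID
        batch_id = f"batch_{hashlib.sha256(idem_tok.encode()).hexdigest()[:16]}"
        existing = _batches.get(batch_id)
        if existing is not None:
            if existing["params_hash"] != _params_hash(params):
                raise ConflictError("idempotency token reused with different parameters")
            # same token + same params: return the existing batch
            return existing["batch"]
    batch = {"id": batch_id, **params}
    _batches[batch_id] = {"params_hash": _params_hash(params), "batch": batch}
    return batch
```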
lgtm, some small nits
```python
@pytest.fixture
async def provider():
```
this is a rather non-specific name for a fixture which isn't actually that general. can you rename this to `batches_provider`?
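A sketch of the suggested rename; the provider construction and lifecycle methods inside are hypothetical:

```python
import pytest


@pytest.fixture
async def batches_provider():
    # hypothetical setup/teardown for the reference batches provider
    provider = ReferenceBatchesImpl()  # hypothetical constructor
    await provider.initialize()
    yield provider
    await provider.shutdown()
```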
I am merging this anyhow as I prepare towards a release.